Did Nicolas Hayek’s Speeches Influence SWATCH’s Stock Price?



“This crisis is in the process of annihilating wealth on a gigantic scale through no fault of the real economy or industrialists and without any sensible measures being visible to control a sector of the financial economy which is run and lauded by just a handful of people.
[…] this new stock market and financial mentality knows only one objective – that is to make money, more money and even more money, as much as possible, whatever the costs. This behavior has a very destructive effect on industry.”

— Nicolas Hayek, 5 September 2008

When Nicolas Hayek made this statement in a speech to the Swiss Business Federation (Economiesuisse) on 5 September 2008, the world was in the midst of the Global Financial Crisis of 2007-2009. The statement is clearly critical of financial markets. Hence, one might wonder whether such a public statement by the head of one of the world’s most successful watch companies led to a reaction from financial markets and banks. After all, SWATCH shares are publicly listed. On the other hand, bankers and financial market participants may have had other worries during the crisis. More generally, it is an interesting question whether public speeches by well-known and influential business leaders have an effect on financial markets. 1


What’s Our Goal?

The idea of this R notebook is to show everyone interested in data science how to communicate data analytics effectively by creating clear and engaging visualisations. Engaging visualisations and a story around a statistical analysis make the analysis much more memorable and enjoyable for the audience, be it your team or boss at work, customers, a school or research project, a blog or newspaper article, the general public, or simply your friends. Usually, around 95% of the time is spent on data analysis and coding tasks and only around 5% on visualising results. But what your audience effectively sees is more like 1% analysis or code and 99% visuals (or so, and maybe some boring text ;)). Engaging visuals and storytelling are therefore key when presenting data analysis to an audience. On the side, we also take a look at a way to analyse the influence of Nicolas Hayek’s public speeches on SWATCH’s stock price. For visualising analyses and findings, we use the ggplot2 and plotly packages (as well as some additional packages), since they produce high-quality, publication-ready visualisations and are easy to handle. ggplot2 is built around the so-called Grammar of Graphics, a structured framework that describes a plot in terms of named components such as data, aesthetics, geoms, scales, and themes. For more information, see Hadley Wickham (2010) - A Layered Grammar of Graphics and Wilkinson (2011) - The Grammar of Graphics.

I can also greatly recommend the following resources:


Prerequisites

To fully understand this R notebook, some knowledge of base R and tidyverse syntax is beneficial. Even without it, it should be possible to reproduce and adjust the graphs and steps by copying the provided R code and modifying it slightly. The resources and R packages referenced above and throughout the notebook may also help. Moreover, some knowledge of finance and statistics makes the content easier to follow.


Hint

All programming code can be shown or hidden in the upper right corner of the notebook or, alternatively, by clicking the code button present in each cell.


1 Settings

# Turn off warning messages

options(warn = -1)

# Custom function for checking installation of packages and loading them

install_and_load_package <- function(package) {
    # Check whether package is already installed and if not, install it
    if (!require(package, character.only = TRUE)) {
        install.packages(package, dependencies = TRUE)
    }

    # Load specified package
    require(package, character.only = TRUE)
}

# Specify packages needed for analysis in character vector

packages <- c("conflicted",
              "gapminder",
              "httr",
              "quantmod",
              "tidyverse",
              "lubridate",
              "tsbox",
              "tidytext",
              "ggrepel",
              "plotly",
              "viridis",
              "viridisLite",
              "RColorBrewer")

# Install and load needed packages

lapply(packages, install_and_load_package)

# Conflicted: hierarchy in case of conflict

conflict_prefer("filter", "dplyr")
conflict_prefer("select", "dplyr")
conflict_prefer("first", "dplyr")
conflict_prefer("last", "dplyr")
conflict_prefer("lag", "dplyr")
conflict_prefer("flatten", "purrr")
conflict_prefer("layout", "plotly")

# Color settings

palette(viridis(n = 10))

col_palette_red    <- brewer.pal(n = 9, name = "OrRd")
col_palette_yellow <- brewer.pal(n = 9, name = "YlOrRd")
col_palette_green  <- brewer.pal(n = 9, name = "YlGn")
col_palette_blue   <- brewer.pal(n = 9, name = "PuBu")
col_palette_grey   <- brewer.pal(n = 9, name = "Greys")

2 Data Input

Information is valued highly in financial markets, and accurate data that is publicly and openly available is not easy to come by. However, there are websites such as Yahoo Finance where financial market data can be downloaded freely. We thus start by gathering publicly available stock price data and data on Nicolas Hayek’s speeches. Subsequently, we transform the data and create a few graphs. Lastly, we address our question of whether Mr. Hayek’s speeches influenced stock prices with a simple visual data analysis.


2.1 Data On Stock Prices

# Some options for quantmod package

options("getSymbols.warning4.0" = FALSE)

To analyse the influence of Mr. Hayek’s speeches on the SWATCH stock price, we start by getting stock data for SWATCH. The data is publicly available from Yahoo Finance, and we use the R quantmod package, which offers a simple and convenient interface for downloading stock price data. All that is required is the ticker of the corresponding stock (here “UHRN.SW”).

getSymbols(Symbols = "UHRN.SW",
           src     = "yahoo",
           verbose = F)

The downloaded stock price data is now available as an xts time series object (the default return class of getSymbols) by calling UHRN.SW. The SWATCH stock data looks like this, with daily observations for each trading day organised in the rows (indexed by date) and six variables, also called features in the data science context, in the columns.

head(UHRN.SW)
##            UHRN.SW.Open UHRN.SW.High UHRN.SW.Low UHRN.SW.Close UHRN.SW.Volume
## 2007-01-03        55.00        55.20       54.80         55.15         204994
## 2007-01-04        54.95        55.25       54.75         55.25         186117
## 2007-01-05        55.00        55.25       54.15         54.20         182313
## 2007-01-08        54.05        54.90       54.00         54.65         215321
## 2007-01-09        54.65        54.80       54.20         54.55          95947
## 2007-01-10        54.20        54.50       53.90         54.45         152143
##            UHRN.SW.Adjusted
## 2007-01-03         39.80366
## 2007-01-04         39.87583
## 2007-01-05         39.11801
## 2007-01-08         39.44279
## 2007-01-09         39.37062
## 2007-01-10         39.29844

For each of the 4’171 daily observations we have the corresponding date as the row index, the Open price at the start of trading on the exchange, the daily High and Low prices, the Close at the end of trading, the trading Volume, and finally an Adjusted price, which accounts for stock splits, dividends, and similar corporate actions. We later transform the object into a more readable format.

Second, to have a comparable benchmark for the SWATCH stock, we also get data on the SPI (Swiss Performance Index) from Yahoo Finance. We use the UBS ETF CH SPI as a proxy for the index (ticker = “SPICHA.SW”).

getSymbols(Symbols = "SPICHA.SW",
           src     = "yahoo",
           verbose = F)

2.1.1 Exercise 1

Get Apple stock data (hint: ticker = “AAPL”).

getSymbols(Symbols = "AAPL",
           src     = "yahoo",
           verbose = F)

2.2 Data On Mr. Hayek’s Speeches

The second ingredient for our analysis is data on Mr. Hayek’s public speeches. This data allows us to determine, first, when and where a public speech by Mr. Hayek took place, and second, what the content of the speech was. Six of Mr. Hayek’s speeches are publicly available on the SWATCH Group website. We thus start by setting the base URL for the SWATCH Group website.

http_link <- "https://www.swatchgroup.com/en/"

Each of the six speeches on the website can be found under a different URL, which we store in a new data.frame/tibble. Instead of laboriously writing a script to download and parse the dates and places of the speeches as well, it is faster to copy that information by hand from the website. The following data.frame/tibble already holds this information.

df_URL_speches <-
    tibble(Date         = c(ymd("2010-04-10"), ymd("2010-03-05"), ymd("2009-08-24"),
                            ymd("2009-05-27"), ymd("2009-03-03"), ymd("2008-09-05")),
           Place        = c("Paris, France", "PSI Colloquium Villigen, Switzerland", "Interlaken, Switzerland",
                            "Berne, Switzerland", "Berne, Switzerland", "Baden, Switzerland"),
           URL_speeches = c("nicolas-g-hayek-sorbonne",
                            "eve-renewable-energy-age",
                            "nicolas-g-hayeks-speech-swiss-ambassadors",
                            "happy-birthday-csem",
                            "nicolas-g-hayek-about-switzerland-and-european-union",
                            "economy-day-swiss-business-federation-economiesuisse"))

df_URL_speches

We can now build a vector holding the full URL of each of the six speeches by concatenating the base URL and the six speech paths.

v_http_links_final <-
    paste(http_link, df_URL_speches$URL_speeches, sep = "")

Since downloading, parsing, and cleaning the text of Mr. Hayek’s speeches is slightly more involved, we can simply use the pre-written functions below. There’s no need to understand the functions in detail; their only purpose is to download the speeches in HTML format and produce a clean text version of them in a data.frame/tibble.

# Define `read_csv_confd` function to read in speeches from URLs

read_csv_confd <-
    function(path) {
        read_csv(file = path, col_names = "HTML")
    }

Let’s request Hayek’s speeches from the SWATCH Group website and save them as a list of tibbles.

l_df_Hayek_speeches <-
    lapply(v_http_links_final,
           read_csv_confd)

3 Data Wrangling

We continue by transforming the gathered stock price and speech data into an easy-to-use format for further analysis. This step in data analysis is often called data wrangling.


3.1 Transform Data On Stock Prices

We want a tibble as the primary data object for our analysis. Tibbles are enhanced data.frames provided by the tibble package (part of the tidyverse); the ts_tbl() function from the tsbox package converts time series objects into them. They provide a standardised way of storing data from varying sources. I also use the native pipe operator |> to make the code easier to read (see picture below for a short explanation of the |> syntax).

df_data_SWATCH <-
    UHRN.SW |>
    ts_tbl()  |>
    ts_wide() |>
    rename(Date     = time,
           Open     = UHRN.SW.Open,
           High     = UHRN.SW.High,
           Low      = UHRN.SW.Low,
           Close    = UHRN.SW.Close,
           Volume   = UHRN.SW.Volume,
           Adjusted = UHRN.SW.Adjusted)
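As a rough illustration of the |> syntax used above (a minimal sketch with made-up numbers, not part of the analysis): the pipe passes the object on its left as the first argument to the function on its right, so a chain of pipes can be read from top to bottom instead of from the inside out. The native pipe requires R 4.1 or newer.

```r
# Hypothetical toy vector, only used to illustrate the pipe
x <- c(1.2, 3.4, 5.6)

# Nested call: has to be read from the inside out
nested <- round(mean(x), digits = 1)

# Piped call: reads step by step, `x` becomes the first argument of `mean()`,
# and the result of `mean()` becomes the first argument of `round()`
piped <-
    x |>
    mean() |>
    round(digits = 1)

# Both ways give the same result
identical(nested, piped)
```

The piped version is simply a different way of writing the nested one, so which to use is mostly a question of readability.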

Let’s quickly look at the SWATCH stock price data. The data formatting is now much better and the features are easier to read and work with.

df_data_SWATCH

We do the same for the SPI data.

df_data_SPI <-
    SPICHA.SW |>
    ts_tbl() |>
    ts_wide() |>
    rename(Date     = time,
           Open     = SPICHA.SW.Open,
           High     = SPICHA.SW.High,
           Low      = SPICHA.SW.Low,
           Close    = SPICHA.SW.Close,
           Volume   = SPICHA.SW.Volume,
           Adjusted = SPICHA.SW.Adjusted)

Finally, we combine both SWATCH stock price and SPI time series to have them available in a single tibble.

df_data_SWATCH_SPI <-
    df_data_SWATCH |>
    full_join(df_data_SPI,
              by     = "Date",
              suffix = c("_SWATCH", "_SPI"))

We can now compute daily returns for both series.

df_data_SWATCH_SPI <-
    df_data_SWATCH_SPI |>
    mutate(Returns_SWATCH = Adjusted_SWATCH / lag(Adjusted_SWATCH) - 1,
           Returns_SPI    = Adjusted_SPI / lag(Adjusted_SPI) - 1)
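To see what this return calculation does, here is a small sketch on a hypothetical four-day price series (the dates and prices are made up). The simple return on a given day is the price of that day divided by the price of the previous day, minus one, so the first observation has no return and is set to NA.

```r
library(dplyr)

# Hypothetical toy prices, not real market data
df_toy_prices <-
    tibble(Date     = as.Date("2020-01-01") + 0:3,
           Adjusted = c(100, 102, 99.96, 104.958))

# Simple returns: today's price relative to the previous row's price.
# `dplyr::lag()` is called explicitly because base R also ships a `stats::lag()`
df_toy_prices <-
    df_toy_prices |>
    mutate(Returns = Adjusted / dplyr::lag(Adjusted) - 1)

df_toy_prices
```

In this toy series the returns are +2%, -2%, and +5%, preceded by one NA.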

3.1.1 Exercise 2

Turn the Apple stock data into a tibble with an appropriate format.

df_Apple_data <-
    AAPL |>
    ts_tbl() |>
    ts_wide() |>
    rename(Date     = time,
           Open     = AAPL.Open,
           High     = AAPL.High,
           Low      = AAPL.Low,
           Close    = AAPL.Close,
           Volume   = AAPL.Volume,
           Adjusted = AAPL.Adjusted)

3.2 Transform Data On Mr. Hayek’s Speeches

Again, we can use the pre-written functions below to transform the data on Mr. Hayek’s speeches. These functions help to clean the text version of those speeches.

# Define function `detect_HTML_paragraph_indices` to detect start and end of speeches in HTMLs

detect_HTML_paragraph_indices <- function(tibble) {
    df_HTML_paragraph_indices <-
        tibble |>
        mutate(HTML_paragraph = str_detect(HTML, pattern = "^<p>")) |>
        # Row numbers of the first and last line starting with a <p> tag
        summarise(HTML_paragraph_index_first = first(which(HTML_paragraph)),
                  HTML_paragraph_index_last  = last(which(HTML_paragraph)))

    return(df_HTML_paragraph_indices)
}

# Define function `extract_HTML_text` to clean speech texts from HTML and other parts

extract_HTML_text <- function(tibble, text_index_start, text_index_end) {
    df_HTML_text <-
        tibble |>
        slice(text_index_start:text_index_end) |>
        mutate(HTML = str_replace_all(HTML,
                                      pattern = c("^<.{1,5}>"           = "",
                                                  "<.{1,5}>$"           = "",
                                                  "<div.*>"             = "",
                                                  "<strong>.*</strong>" = "",
                                                  "&nbsp;"              = "",
                                                  "<img.*>"             = "",
                                                  "<em.*>"              = "",
                                                  "<.*>"                = "")))  # Recheck this line

    return(df_HTML_text)
}

# Detect start and endings of Hayek's speeches in HTMLs

df_Hayek_speeches_HTML_paragraph_indices <-
    map_dfr(l_df_Hayek_speeches,
            detect_HTML_paragraph_indices)

# Clean Hayek's speeches: apply `extract_HTML_text` to each speech with its own start and end index

l_df_Hayek_speeches_clean <-
    pmap(list(tibble           = l_df_Hayek_speeches,
              text_index_start = df_Hayek_speeches_HTML_paragraph_indices$HTML_paragraph_index_first,
              text_index_end   = df_Hayek_speeches_HTML_paragraph_indices$HTML_paragraph_index_last),
         extract_HTML_text)

names(l_df_Hayek_speeches_clean) <-
    as.character(df_URL_speches$Date)

The speeches are saved in a list of tibbles and can be accessed via l_df_Hayek_speeches_clean[[index]], where index is the position of the speech in the list. We could, for example, look at the second of Mr. Hayek’s speeches.

l_df_Hayek_speeches_clean[[2]]
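The cleaning step in extract_HTML_text relies on str_replace_all() with a named vector, where each name is a regular expression and each value its replacement. The following sketch shows the mechanism on two made-up lines of HTML. Note that the two patterns used here are my own simplified assumptions for illustration, not the ones from the function above.

```r
library(stringr)

# Hypothetical HTML lines, not taken from the SWATCH Group website
html_lines <- c("<p>Ladies and gentlemen,&nbsp;welcome.</p>",
                "<p>Thank you <em>very</em> much.</p>")

# Named vector: names are patterns, values are replacements,
# applied one after the other to every element of `html_lines`
clean_lines <-
    str_replace_all(html_lines,
                    pattern = c("&nbsp;"  = " ",  # non-breaking space entity -> normal space
                                "<[^>]+>" = ""))  # any single HTML tag -> nothing

clean_lines
```

The pattern "<[^>]+>" stops at the first closing bracket and therefore removes one tag at a time. A greedy pattern such as "<.*>" instead matches from the first < to the last > on a line and would also remove any text between two tags, which is worth keeping in mind when adjusting the cleaning rules.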

4 Text Analysis

After cleaning and transforming Mr. Hayek’s speech data, we proceed with a simple text analysis of the speech content. We start by splitting one of Mr. Hayek’s speeches into single words.

df_Hayek_speeches_words <-
    l_df_Hayek_speeches_clean[[2]] |>
    unnest_tokens(input    = HTML,
                  output   = word,
                  token    = "words",
                  format   = "html",
                  to_lower = T,
                  drop     = T)

Additionally, it is good practice to remove stop words such as ‘the’ and ‘a’, because they carry less meaning than verbs, nouns, and adjectives.

df_Hayek_speeches_words_wo_stop_words <-
    df_Hayek_speeches_words |>
    anti_join(stop_words,
              by = "word")

We may now count the word frequency and keep the 15 most frequent words.

df_Hayek_speeches_words_wo_stop_words_counted <-
    df_Hayek_speeches_words_wo_stop_words |>
    count(word, sort = T) |>
    slice(1:15)
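To make the three steps above (splitting into words, removing stop words, counting) more tangible, here is a base R sketch of the same idea on a single made-up sentence. The tiny stop word list is a hypothetical stand-in for the much longer stop_words table from tidytext, and the code only mimics what unnest_tokens(), anti_join(), and count() do; it is not how these functions are implemented.

```r
# Hypothetical example sentence
text <- "The market wants money, more money and even more money."

# Step 1: lower-case the text and split it at everything that is not a letter
words <- unlist(strsplit(tolower(text), split = "[^a-z]+"))
words <- words[words != ""]

# Step 2: drop stop words (toy list, assumed for illustration only)
toy_stop_words <- c("the", "and", "even", "more")
words_kept     <- words[!words %in% toy_stop_words]

# Step 3: count how often each remaining word occurs, most frequent first
word_counts <- sort(table(words_kept), decreasing = TRUE)

word_counts
```

Here "money" comes out on top with three occurrences, while "market" and "wants" appear once each.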

5 Our First Plot - Line Graph

Now we’re ready to create our first graph, a line or time series chart. We use SWATCH’s stock price data and the ggplot plotting engine. We need the previously mentioned Grammar of Graphics to set up each component of the plot. First, we need to map the data to so-called aesthetics in the plot. For a visual overview and corresponding explanations of the different components in ggplot’s Grammar of Graphics, see this Towards Data Science article:

Aesthetics are defined within the aes() function in ggplot and include plot specifications such as what goes on the x-axis and y-axis, what is shown in which colour, how the size of an object in a plot is determined and many more. For our basic time series plot, we simply map the Date column from the stock data to the x-axis and the Adjusted stock price to the y-axis. The only additional component to add to get a finished plot now is a so-called geom (short for geometric objects). Geoms determine the kind of plot we want to display and are added with the set of geom_... functions. Here, we’d like to create a simple line plot with geom_line(). First, we add a new component to the plot by using the + operator, which separates each of the components of the plot. Then we set the line geom and, after saving the plot to a new R object, we have our first plot.

p_basic_time_series_SWATCH <-
    ggplot(data = df_data_SWATCH,
           aes(x = Date, y = Adjusted)) +  # Close
    geom_line()

p_basic_time_series_SWATCH

5.0.1 Exercise 3

Create a time series plot for Apple’s stock price. You can also try to adjust the axis scales, in case you have an idea how to do it.

df_Apple_data |>
    ggplot(aes(x = Date, y = Adjusted)) +
    geom_line() +
    scale_x_date(date_breaks = "1 year",
                 date_labels = "%Y") +
    scale_y_continuous(labels = scales::dollar,
                       breaks = seq(from = 0, to = max(df_Apple_data$Adjusted, na.rm = T), by = 20))

So far, so good. This is what we get with ggplot’s default settings. However, the plot doesn’t look particularly great, does it? The grey background is rather irritating, the date on the x-axis is only displayed every five years, it’s unclear in what units the y-axis is measured, and there’s no title or anything else to indicate what exactly is shown here. The only information we have is the evolution of the series over the sample period and its corresponding values on the y-axis. So we need to adjust some basic components of the plot.

Since we already defined our data and aesthetics components, we start by adjusting the scales of the x- and y-axes in a new component, the scales component. This ensures that we get proper units and labels for both axes. We take the ggplot object from above and add scale_x_... and scale_y_... functions with proper arguments. The x-axis should be set to dates in yearly units and the y-axis to a continuous scale in CHF, the currency in which SWATCH shares trade on the SIX Swiss Exchange.

p_basic_time_series_SWATCH_w_scales <-
    p_basic_time_series_SWATCH +
    scale_x_date(date_breaks = "1 year",
                 date_labels = "%Y") +
    scale_y_continuous(labels = scales::number_format(prefix = "CHF "),
                       breaks = scales::pretty_breaks(n = 6))

p_basic_time_series_SWATCH_w_scales

The theme of a plot is yet another component in the Grammar of Graphics. Setting a cleaner theme will help us to get rid of the irritating grey background. Let’s try the theme_classic() function.

p_basic_time_series_SWATCH_w_scales_and_theme <-
    p_basic_time_series_SWATCH_w_scales +
    theme_classic()

p_basic_time_series_SWATCH_w_scales_and_theme

5.0.2 Exercise 4

Let’s create a line plot with the same theme and appropriately adjusted scales for Apple. Try adding a proper title to the plot.

p_time_series_Apple <-
    df_Apple_data |>
    ggplot(aes(x = Date, y = Adjusted)) +
    geom_line() +
    scale_x_date(date_breaks = "1 year",
                 date_labels = "%Y") +
    scale_y_continuous(labels = scales::dollar,
                       breaks = seq(from = 0, to = max(df_Apple_data$Adjusted, na.rm = T), by = 20)) +
    theme_classic() +
    labs(title    = "A Story of Success (and Steve Jobs)",
         subtitle = "Apple's Stock Price",
         y        = "Close (Adjusted)",
         caption  = "© Matthieu Rüttimann")

p_time_series_Apple

theme_classic() is quite a clean and simple theme. For the purpose of interpreting a time series plot, however, a theme that includes a grid may be more appropriate. Thus, in the following plots, we use theme_light() instead. With additional theme elements, we keep the grid lines in the background of the plot by slightly fading them out, since they are only meant to support the viewer in reading values off the axes. We also accentuate the x- and y-axes by drawing them thicker than the background grid lines. Next, we add a proper title: main titles, subtitles, and axis labels are set with the labs() function. Let’s also adjust the label of the y-axis to make it clearer what it represents. Finally, we add a caption with a copyright notice. Now we have our first complete time series plot.

p_basic_time_series_SWATCH_w_scales_themed <-
    p_basic_time_series_SWATCH_w_scales +
    theme_light() +
    theme(plot.title       = element_text(face = "bold"),  # Bold plot titles
          axis.line        = element_line(size = 0.75),    # thicker axes
          panel.grid.major = element_line(size = 0.05),    # softer grid lines
          panel.grid.minor = element_line(size = 0.05)) +  # softer grid lines
    labs(title    = "Steady As a Ship...?",
         subtitle = "SWATCH Group Stock Price (Ticker: UHRN.SW)",
         y        = "Close (Adjusted)",
         caption  = "© Matthieu Rüttimann")

p_basic_time_series_SWATCH_w_scales_themed

For the following plots, let’s set a global default ggplot theme, instead of adding it manually to each plot.

theme_set(theme_light())

To improve further on our plot, we can add a so-called benchmark to it. A benchmark is, e.g., another time series to compare the SWATCH stock price to. We use the previously gathered SPI series to do exactly that. In order to compare the stock prices of the two series directly to each other, a rebasing of the prices to a specific time point is required. We choose 2011-07-18, since it is the first day with available observations for the SPI in our data sample.

df_data_SWATCH_SPI_filtered <-
    df_data_SWATCH_SPI |>
    filter(Date >= as.Date("2011-07-18")) |>
    mutate(Adjusted_SWATCH_Rebased = Adjusted_SWATCH / first(Adjusted_SWATCH),
           Adjusted_SPI_Rebased    = Adjusted_SPI / first(Adjusted_SPI))

df_scale_date <-
    df_data_SWATCH_SPI_filtered |>
    summarise(scale_date_min = min(Date, na.rm = T),
              scale_date_max = max(Date, na.rm = T) + 250)
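The rebasing step above divides every price by the first price in the filtered sample, so each series starts at 1 (that is, 100%) and later values can be read as growth relative to the starting date. A minimal sketch with two hypothetical series at very different price levels (the numbers are made up):

```r
library(dplyr)

# Hypothetical toy prices: series A trades around 50, series B around 200
df_toy_levels <-
    tibble(Date    = as.Date("2020-01-01") + 0:2,
           Price_A = c(50, 55, 60),
           Price_B = c(200, 210, 190))

# Rebase both series to their value on the first date
df_toy_rebased <-
    df_toy_levels |>
    mutate(A_Rebased = Price_A / dplyr::first(Price_A),
           B_Rebased = Price_B / dplyr::first(Price_B))

df_toy_rebased
```

After rebasing, series A ends at 1.2 (up 20%) and series B at 0.95 (down 5%), which makes the two directly comparable even though their price levels differ by a factor of four.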

Let’s first create a rebased line graph of the SWATCH stock, starting on 2011-07-18. In addition to the elements of the prior graphs, we add a geom_hline element to mark the 100% line on the y-axis.

p_time_series_SWATCH_vs_SPI <-
    df_data_SWATCH_SPI_filtered |>
    ggplot(aes(x = Date)) +
    geom_hline(yintercept = 1, size = 1.5, col = "grey", alpha = 0.5) +
    geom_line(aes(y = Adjusted_SWATCH_Rebased), col = col_palette_red[6]) +
    geom_point(aes(x = last(Date),
                   y = last(Adjusted_SWATCH_Rebased)),
               col   = col_palette_red[6],
               shape = 3,
               size  = 2) +
    scale_x_date(date_breaks = "1 year",
                 date_labels = "%Y",
                 limits = c(df_scale_date$scale_date_min, df_scale_date$scale_date_max)) +
    scale_y_continuous(labels = scales::percent,
                       breaks = scales::pretty_breaks(n = 8)) +
    labs(title    = "Was it SWATCH's Time to Perform?",
         subtitle = "SWATCH Stock Price vs. SPI Benchmark",
         y        = "Price Rebased (%)",
         caption  = "© Matthieu Rüttimann") +
    theme(legend.text      = element_text(),
          plot.title       = element_text(face = "bold"),
          axis.line        = element_line(size = 0.75),
          panel.grid.major = element_line(size = 0.05),
          panel.grid.minor = element_line(size = 0.05))

p_time_series_SWATCH_vs_SPI

Now we take the above graph and add the SPI benchmark, as well as text labels.

p_time_series_SWATCH_vs_SPI <-
    p_time_series_SWATCH_vs_SPI +
    geom_line(aes(y = Adjusted_SPI_Rebased), col = col_palette_green[7]) +
    geom_point(aes(x = last(Date),
                   y = last(Adjusted_SPI_Rebased)),
               col   = col_palette_green[7],
               shape = 3,
               size  = 2) +
    geom_text(label = "SWATCH",
              aes(x = last(Date),
                  y = last(Adjusted_SWATCH_Rebased)),
              color = col_palette_red[7],
              size  = 2.5,
              hjust = -0.3) +
    geom_text(label = "SPI",
              aes(x = last(Date),
                  y = last(Adjusted_SPI_Rebased)),
              color = col_palette_green[8],
              size  = 2.5,
              hjust = -0.3)

p_time_series_SWATCH_vs_SPI

Apparently, SWATCH underperformed the SPI over the period from 2011 to 2023. In particular, beginning around June 2014, the SWATCH price declined rather sharply relative to the SPI. Over the entire period, SWATCH shares slightly lost in value while the SPI rose by over 200%.

Finally, we can highlight the time period when the two price series started to diverge by shading the background. We can do this with the annotate() function. Highlighting areas or specific parts of a chart is a useful element of storytelling with data (engaging titles, proper labels, and colours are others). A general suggestion is to use colours for specific messages we want to convey to our audience. People usually associate red with negative sentiments and green with positive ones, even subconsciously. In addition, red is usually the first thing the eye picks up when looking at a graph. Finally, apart from colour and other visual elements that capture our attention, most people read a graph from left to right and top to bottom.

p_time_series_SWATCH_vs_SPI +
    annotate(geom  = "rect",
             xmin  = as.Date("2014-06-01"),
             xmax  = as.Date("2018-07-01"),
             ymin  = -Inf,
             ymax  = Inf,
             col   = "grey",
             alpha = 0.05) +
    annotate(geom  = "text",
             label = "Divergence",
             x     = as.Date("2016-07-01"),
             y     = 2.5,
             col   = col_palette_grey[7],
             size  = 4)

5.0.3 Exercise 5

Let’s try adding a title, subtitle, and some text or line annotations to our Apple chart as an exercise. Titles and text annotations are a great way to tell a story in a graph.

text_Apple <- "…Nevertheless, \n some bumps \n occurred along \n the road"

p_time_series_Apple +
    labs(title    = "Apple Fared Pretty Well…",
         subtitle = "Apple's Stock Price Over 16 Years",
         y        = "Stock Price") +
    geom_text(x     = as.Date("2018-07-01"),
              y     = 80,
              color = col_palette_blue[4],
              label = text_Apple,
              size  = 3)

6 Our Second Plot - Bar Chart

One of the most common and useful graphs is a bar chart. Let’s create one. We need a new geom, geom_col(), for that purpose. Additionally, we order the x and fill aesthetics by the word frequency n with fct_reorder(), adding the argument .desc = T to sort in descending order.

p_Hayek_speech_words_bar_chart <-
    df_Hayek_speeches_words_wo_stop_words_counted |>
    ggplot(aes(x = fct_reorder(word, n, .desc = T), y = n, fill = fct_reorder(word, n, .desc = T))) +
    geom_col(alpha = 0.9) +
    scale_fill_viridis_d(name      = "Words",
                         direction = -1) +
    labs(title    = "How Does Hayek Speak? What Words Does He Use?",
         subtitle = "Speech at PSI Colloquium Villigen, Switzerland on 5 March 2010, by Nicolas Hayek",
         x        = "Words",
         y        = "Word Frequency") +
    theme(axis.text.x = element_text(angle = 60))  # Turn x-axis labels by 60 degrees

p_Hayek_speech_words_bar_chart
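What fct_reorder() does in the plot above is easiest to see on a small example. ggplot orders the bars by the levels of the factor, which are alphabetical by default, and fct_reorder() from the forcats package rearranges these levels according to a second variable. A minimal sketch with three hypothetical words and made-up counts:

```r
library(forcats)

# Hypothetical words and frequencies
word <- c("energy", "future", "solar")
n    <- c(5, 12, 8)

# Default: levels sorted by ascending frequency
levels_ascending <- levels(fct_reorder(word, n))

# With `.desc = TRUE`: levels sorted by descending frequency
levels_descending <- levels(fct_reorder(word, n, .desc = TRUE))

levels_ascending
levels_descending
```

The ascending order is energy, solar, future, and the descending order is future, solar, energy, which is the order in which the bars would appear from left to right.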

We can add the frequency also as text with geom_text to make the graph more legible.

p_Hayek_speech_words_bar_chart <-
    p_Hayek_speech_words_bar_chart +
    geom_text(aes(y = n + 2.5, label = n),
              size = 2.5)

p_Hayek_speech_words_bar_chart

A hot tip is to flip bar charts with coord_flip() so that the bars run horizontally, which makes the word labels easier to read and the chart visually more appealing.

p_Hayek_speech_words_bar_chart_inverted <-
    df_Hayek_speeches_words_wo_stop_words_counted |>
    ggplot(aes(x = fct_reorder(word, n), y = n, fill = fct_reorder(word, n))) +
    geom_col(alpha = 0.9) +
    geom_text(aes(y = n + 2.5, label = n),
              size = 2.5) +
    coord_flip() +
    scale_fill_viridis_d(name      = "Words",
                         direction = 1) +
    labs(title    = "How Does Hayek Speak? What Words Does He Use?",
         subtitle = "Speech at PSI Colloquium Villigen, Switzerland on 5 March 2010, by Nicolas Hayek",
         x        = "Words",
         y        = "Word Frequency")

p_Hayek_speech_words_bar_chart_inverted

7 Stock Prices and Hayek’s Speeches

Finally, we’ll try to answer our question of whether Nicolas Hayek’s speeches influenced SWATCH’s stock price. To do so, we’ll analyse the relation between the dates of Hayek’s speeches and SWATCH’s stock price data.

Disclaimer: We do not have intraday data here, and financial markets usually react within a few minutes or seconds to unanticipated news. Hence, to evaluate scientifically and accurately whether Mr. Hayek’s speeches had an influence on SWATCH’s stock price, we’d need to look at intraday tick data, not daily data. The analysis in this section is therefore only illustrative.

We combine stock and speech data.

df_data_SWATCH_SPI_speeches <-
    df_data_SWATCH_SPI |>
    left_join(df_URL_speches,
              by = "Date")

We can take one of our prior plots of SWATCH’s stock price and add red lines with geom_vline which indicate each date of Hayek’s six speeches.

p_basic_time_series_SWATCH_w_scales_themed +
    geom_vline(xintercept = df_URL_speches$Date,
               color = "red",
               alpha = 0.6)

However, stock prices themselves are not very indicative of the influence the speeches had. Although one may think that the first of Hayek’s speeches led to a sharp decline in SWATCH’s stock price, the stock market usually absorbs information extremely quickly, and the downward trend could simply have coincided with the speech by chance.

We need to look at returns instead of prices to answer our question. One possible approach is to check whether stock returns on the day of a speech were higher than on similar days shortly before or after the speech.

df_data_SWATCH_SPI_speeches <-
    df_data_SWATCH_SPI_speeches |>
    mutate(speech_day = if_else(!is.na(URL_speeches), "Yes", "No"))

df_data_SWATCH_SPI_speeches |>
    ggplot(aes(x = Date, y = Returns_SWATCH, col = speech_day)) +
    geom_hline(yintercept = 0,
               col        = col_palette_grey[5],
               alpha      = 0.5,
               size       = 2.5) +
    geom_point(alpha = 0.6) +
    geom_vline(xintercept = df_URL_speches$Date,
               color      = "red",
               alpha      = 0.6) +
    scale_x_date(breaks = scales::pretty_breaks(n = 6),
                 limits = c(min(df_URL_speches$Date) - 50,
                            max(df_URL_speches$Date) + 50)) +
    scale_y_continuous(labels = scales::percent,
                       breaks = scales::pretty_breaks(n = 6)) +
    scale_color_manual(name   = "Speech Day",
                       values = c("Yes"  = col_palette_red[7],
                                  "No"   = "black")) +
    labs(title    = "Did Mr. Hayek's Speeches Lead to Higher Stock Returns?",
         subtitle = "Daily SWATCH Returns and Hayek's Speeches, Red Lines Indicate Speeches",
         y        = "Stock Returns") +
    theme(legend.position = "bottom")

Looking at the graph above, it does not appear that stock returns on days of a speech (in red) were higher than on normal days.
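The visual impression can be complemented with a quick numerical comparison of speech days and all other days. The sketch below shows the idea on hypothetical data: ten consecutive days of made-up returns and one made-up speech date. The column and object names are assumptions for illustration, and with only a handful of speech days such a comparison remains descriptive rather than a statistical test.

```r
library(dplyr)

# Hypothetical toy data: ten days of returns and a single speech date
df_toy_returns <-
    tibble(Date    = as.Date("2020-01-06") + 0:9,
           Returns = c(0.01, -0.02, 0.005, 0.03, -0.01,
                       0.02, -0.015, 0.01, 0, -0.005))

v_toy_speech_dates <- as.Date("2020-01-09")

# Flag speech days, then compare the average size of returns in both groups.
# Absolute returns are used because a speech could move the price up or down
df_toy_summary <-
    df_toy_returns |>
    mutate(speech_day = if_else(Date %in% v_toy_speech_dates, "Yes", "No")) |>
    group_by(speech_day) |>
    summarise(n_days          = n(),
              mean_abs_return = mean(abs(Returns)))

df_toy_summary
```

Applied to the real data, the same grouping could be restricted to a window of days around each speech, in the spirit of the comparison with "similar days" described above.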

An even better way to analyse whether the speeches had an influence is to look at the range of daily stock prices instead. We define the range here as the daily High minus the daily Low, both of which we already have as features in our data sample.

df_data_SWATCH_SPI_speeches <-
    df_data_SWATCH_SPI_speeches |>
    mutate(range = High_SWATCH - Low_SWATCH)

df_data_SWATCH_SPI_speeches |>
    ggplot(aes(x = Date, y = range, col = speech_day)) +
    geom_hline(yintercept = 0,
               col        = col_palette_grey[5],
               alpha      = 0.5,
               size       = 2.5) +
    geom_point(alpha = 0.6) +
    geom_vline(xintercept = df_URL_speches$Date,
               color      = "red",
               alpha      = 0.6) +
    scale_x_date(limits = c(min(df_URL_speches$Date) - 50,
                            max(df_URL_speches$Date) + 50),
                 breaks = scales::pretty_breaks(n = 6)) +
    scale_y_continuous(labels = scales::number_format(prefix = "CHF "),
                       breaks = scales::pretty_breaks(n = 6)) +
    scale_color_manual(name   = "Speech Day",
                       values = c("Yes" = col_palette_red[7],
                                  "No"  = "black")) +
    labs(title    = "Did Mr. Hayek's Speeches Lead to a Wider Daily Price Range?",
         subtitle = "Daily SWATCH High-Low Range and Hayek's Speeches, Red Lines Indicate Speeches",
         y        = "Daily High - Low (Range of Stock Prices)") +
    theme(legend.position = "bottom")

Again, it appears that Hayek’s speeches did not have a particularly strong relation with the range of stock prices.

Hence, after our simple data analysis, we can conclude that Nicolas Hayek’s speeches likely did not have a visible influence on SWATCH’s stock price, at least at the daily frequency.


  1. Sources: Swissinfo.ch and Swatch Group Website